by Mercy Nyambura Kariuki
The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth in 2016 as the 2016 Update. The World Happiness Report 2017, which ranks 155 countries by their happiness levels, was released at the United Nations at an event celebrating International Day of Happiness on March 20th. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.
The happiness scores and rankings use data from the Gallup World Poll. The scores are based on answers to the main life evaluation question asked in the poll. This question, known as the Cantril ladder, asks respondents to think of a ladder with the best possible life for them being a 10 and the worst possible life being a 0 and to rate their own current lives on that scale. The scores are from nationally representative samples for the years 2013-2016 and use the Gallup weights to make the estimates representative. The columns following the happiness score estimate the extent to which each of six factors – economic production, social support, life expectancy, freedom, absence of corruption, and generosity – contribute to making life evaluations higher in each country than they are in Dystopia, a hypothetical country that has values equal to the world’s lowest national averages for each of the six factors. They have no impact on the total score reported for each country, but they do explain why some countries rank higher than others.
Stage 1: Data Acquisition
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
Get the Data
happiness2015=pd.read_csv("C:/Users/kariu/OneDrive/Desktop/Data Science/Google_Capstone_Project/Choose Your Own Dataset/World Happiness Report/2015.csv")
happiness2016=pd.read_csv("C:/Users/kariu/OneDrive/Desktop/Data Science/Google_Capstone_Project/Choose Your Own Dataset/World Happiness Report/2016.csv")
happiness2017=pd.read_csv("C:/Users/kariu/OneDrive/Desktop/Data Science/Google_Capstone_Project/Choose Your Own Dataset/World Happiness Report/2017.csv")
happiness2018=pd.read_csv("C:/Users/kariu/OneDrive/Desktop/Data Science/Google_Capstone_Project/Choose Your Own Dataset/World Happiness Report/2018.csv")
happiness2019=pd.read_csv("C:/Users/kariu/OneDrive/Desktop/Data Science/Google_Capstone_Project/Choose Your Own Dataset/World Happiness Report/2019.csv")
happiness2015.sample()
| | Country | Region | Happiness Rank | Happiness Score | Standard Error | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | Brazil | Latin America and Caribbean | 16 | 6.983 | 0.04076 | 0.98124 | 1.23287 | 0.69702 | 0.49049 | 0.17521 | 0.14574 | 3.26001 |
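As a quick sanity check on how the factor columns relate to the score, the sketch below adds up the six factor columns and the Dystopia Residual for the Brazil row sampled above. It assumes, as the arithmetic on this one row suggests, that these seven columns sum to the Happiness Score up to rounding. The variable names are illustrative only.

```python
# Values copied from the 2015 Brazil row shown above.
brazil_2015 = {
    "Economy (GDP per Capita)": 0.98124,
    "Family": 1.23287,
    "Health (Life Expectancy)": 0.69702,
    "Freedom": 0.49049,
    "Trust (Government Corruption)": 0.17521,
    "Generosity": 0.14574,
    "Dystopia Residual": 3.26001,
}
reported_score = 6.983

# Assumption: score = six factor contributions + Dystopia Residual.
reconstructed_score = sum(brazil_2015.values())
difference = abs(reconstructed_score - reported_score)

print(f"reconstructed={reconstructed_score:.3f} reported={reported_score}")
```

This is also why the Dystopia Residual column can be dropped later without changing any score: it only explains the total, it does not produce it.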
happiness2016.sample()
| | Country | Region | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 107 | Palestinian Territories | Middle East and Northern Africa | 108 | 4.754 | 4.649 | 4.859 | 0.67024 | 0.71629 | 0.56844 | 0.17744 | 0.10613 | 0.11154 | 2.40364 |
files = [happiness2015, happiness2016, happiness2017, happiness2018, happiness2019]
string =["happiness2015", "happiness2016", "happiness2017", "happiness2018", "happiness2019"]
for i,j in zip(string,files):
    print(i)
    print("**********")
    print(j.shape)
    print((j.columns.values))
    print()
happiness2015
**********
(158, 12)
['Country' 'Region' 'Happiness Rank' 'Happiness Score' 'Standard Error' 'Economy (GDP per Capita)' 'Family' 'Health (Life Expectancy)' 'Freedom' 'Trust (Government Corruption)' 'Generosity' 'Dystopia Residual']

happiness2016
**********
(157, 13)
['Country' 'Region' 'Happiness Rank' 'Happiness Score' 'Lower Confidence Interval' 'Upper Confidence Interval' 'Economy (GDP per Capita)' 'Family' 'Health (Life Expectancy)' 'Freedom' 'Trust (Government Corruption)' 'Generosity' 'Dystopia Residual']

happiness2017
**********
(155, 12)
['Country' 'Happiness.Rank' 'Happiness.Score' 'Whisker.high' 'Whisker.low' 'Economy..GDP.per.Capita.' 'Family' 'Health..Life.Expectancy.' 'Freedom' 'Generosity' 'Trust..Government.Corruption.' 'Dystopia.Residual']

happiness2018
**********
(156, 9)
['Overall rank' 'Country or region' 'Score' 'GDP per capita' 'Social support' 'Healthy life expectancy' 'Freedom to make life choices' 'Generosity' 'Perceptions of corruption']

happiness2019
**********
(156, 9)
['Overall rank' 'Country or region' 'Score' 'GDP per capita' 'Social support' 'Healthy life expectancy' 'Freedom to make life choices' 'Generosity' 'Perceptions of corruption']
Stage 2: Data Wrangling
Let's map countries to regions
2015 and 2016 have a Region Column. 2017 doesn't have Region column. 2018 and 2019 have a Country or Region column where the values are country names.
We will pick region codes from happiness2015 and add Region to happiness2019, happiness2018 and happiness2017.
All the data sets have different column names. Let's also standardize the column names as below:
- "Rank"
- "Country"
- "Region"
- "Happiness Score"
- "GDP per Capita"
- "Social Support"
- "Health (Life Expectancy)"
- "Freedom"
- "Generosity"
- "Trust"
country_region=happiness2015[['Country', 'Region']]
country_region_dictionary=dict(zip(country_region['Country'],
country_region['Region']))
2019
happiness2019.columns=["Rank", "Country", "Happiness Score", "GDP per Capita", "Social Support", "Health (Life Expectancy)", "Freedom", "Generosity", "Trust"]
happiness2019["Region"]=happiness2019["Country"].map(country_region_dictionary)
happiness2019.sample()
| | Rank | Country | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust | Region |
|---|---|---|---|---|---|---|---|---|---|---|
| 52 | 53 | Latvia | 5.94 | 1.187 | 1.465 | 0.812 | 0.264 | 0.075 | 0.064 | Central and Eastern Europe |
happiness2019=happiness2019[["Rank", "Country","Region","Happiness Score","GDP per Capita", "Social Support","Health (Life Expectancy)","Freedom", "Generosity","Trust"]]
happiness2019.sample()
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 126 | 127 | Congo (Kinshasa) | Sub-Saharan Africa | 4.418 | 0.094 | 1.125 | 0.357 | 0.269 | 0.212 | 0.053 |
2018
happiness2018.columns =["Rank", "Country","Happiness Score","GDP per Capita", "Social Support",'Health (Life Expectancy)',"Freedom", "Generosity","Trust"]
happiness2018["Region"]=happiness2018["Country"].map(country_region_dictionary)
happiness2018=happiness2018[["Rank", "Country","Region","Happiness Score","GDP per Capita", "Social Support","Health (Life Expectancy)","Freedom", "Generosity","Trust"]]
happiness2018.sample()
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 89 | 90 | Jordan | Middle East and Northern Africa | 5.161 | 0.822 | 1.265 | 0.645 | 0.468 | 0.13 | 0.134 |
2017
happiness2017=happiness2017.drop(["Whisker.high", "Whisker.low", "Dystopia.Residual"], axis=1)
# The raw 2017 file lists Country before Happiness.Rank, so the new names must follow that order
happiness2017.columns =["Country", "Rank","Happiness Score","GDP per Capita", "Social Support",'Health (Life Expectancy)',"Freedom", "Generosity","Trust"]
happiness2017["Region"]=happiness2017["Country"].map(country_region_dictionary)
happiness2017=happiness2017[["Rank", "Country","Region","Happiness Score","GDP per Capita", "Social Support","Health (Life Expectancy)","Freedom", "Generosity","Trust"]]
happiness2017.sample()
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 80 | 81 | Indonesia | Southeastern Asia | 5.262 | 0.995539 | 1.274445 | 0.492346 | 0.443323 | 0.611705 | 0.015317 |
2016
happiness2016=happiness2016.drop(["Lower Confidence Interval", "Region", "Upper Confidence Interval", "Dystopia Residual"], axis=1)
# The raw 2016 file lists Country before Happiness Rank, and Trust before Generosity
happiness2016.columns =["Country", "Rank","Happiness Score","GDP per Capita", "Social Support",'Health (Life Expectancy)',"Freedom", "Trust","Generosity"]
happiness2016["Region"]=happiness2016["Country"].map(country_region_dictionary)
happiness2016=happiness2016[["Rank", "Country","Region","Happiness Score","GDP per Capita", "Social Support","Health (Life Expectancy)","Freedom", "Generosity","Trust"]]
happiness2016.sample()
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 132 | 133 | Sudan | Sub-Saharan Africa | 4.139 | 0.63069 | 0.81928 | 0.29759 | 0.0 | 0.18077 | 0.10039 |
2015
happiness2015=happiness2015.drop(["Standard Error", "Dystopia Residual"], axis=1)
# The raw 2015 file lists Trust before Generosity, so the new names must follow that order
happiness2015.columns =["Country","Region", "Rank", "Happiness Score","GDP per Capita", "Social Support",'Health (Life Expectancy)',"Freedom", "Trust","Generosity"]
happiness2015=happiness2015[["Rank", "Country","Region","Happiness Score","GDP per Capita", "Social Support","Health (Life Expectancy)","Freedom", "Generosity","Trust"]]
happiness2015.sample()
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 13 | Austria | Western Europe | 7.2 | 1.33723 | 1.29704 | 0.89042 | 0.62433 | 0.33088 | 0.18676 |
Check column names are the same in all the dataframes
set(happiness2015.columns)==set(happiness2016.columns)==set(happiness2017.columns)==set(happiness2018.columns)==set(happiness2019.columns)
True
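The set comparison above only answers yes or no, and it ignores column order. When it returns False, a small helper that names the offending columns can save time. The sketch below is one possible version; the function name column_differences and the example inputs are hypothetical, with the column lists copied from the outputs earlier in this notebook.

```python
def column_differences(columns_by_name):
    """Compare every column list against the first one and report mismatches."""
    names = list(columns_by_name)
    reference = set(columns_by_name[names[0]])
    report = {}
    for name in names[1:]:
        current = set(columns_by_name[name])
        if current != reference:
            report[name] = {
                "missing": sorted(reference - current),
                "extra": sorted(current - reference),
            }
    return report


standard = ["Rank", "Country", "Region", "Happiness Score", "GDP per Capita",
            "Social Support", "Health (Life Expectancy)", "Freedom",
            "Generosity", "Trust"]
raw_2018 = ["Overall rank", "Country or region", "Score", "GDP per capita",
            "Social support", "Healthy life expectancy",
            "Freedom to make life choices", "Generosity",
            "Perceptions of corruption"]

report = column_differences({
    "happiness2019": standard,
    "happiness2019_copy": list(standard),
    "happiness2018_raw": raw_2018,
})
print(report)
```

With DataFrames, the same helper could be called as column_differences({name: df.columns for name, df in ...}), since it only needs iterables of column names.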
Check for Null values in the dataframes
for i, j in zip(string, files):
    print(i, "has", j.shape, "Rows X Columns")
    print("************")
    print(j.isnull().sum())
    print("\n")
happiness2015 has (158, 12) Rows X Columns
************
Country 0
Region 0
Happiness Rank 0
Happiness Score 0
Standard Error 0
Economy (GDP per Capita) 0
Family 0
Health (Life Expectancy) 0
Freedom 0
Trust (Government Corruption) 0
Generosity 0
Dystopia Residual 0
dtype: int64

happiness2016 has (157, 13) Rows X Columns
************
Country 0
Region 0
Happiness Rank 0
Happiness Score 0
Lower Confidence Interval 0
Upper Confidence Interval 0
Economy (GDP per Capita) 0
Family 0
Health (Life Expectancy) 0
Freedom 0
Trust (Government Corruption) 0
Generosity 0
Dystopia Residual 0
dtype: int64

happiness2017 has (155, 12) Rows X Columns
************
Country 0
Happiness.Rank 0
Happiness.Score 0
Whisker.high 0
Whisker.low 0
Economy..GDP.per.Capita. 0
Family 0
Health..Life.Expectancy. 0
Freedom 0
Generosity 0
Trust..Government.Corruption. 0
Dystopia.Residual 0
dtype: int64

happiness2018 has (156, 10) Rows X Columns
************
Rank 0
Country 0
Happiness Score 0
GDP per Capita 0
Social Support 0
Health (Life Expectancy) 0
Freedom 0
Generosity 0
Trust 1
Region 6
dtype: int64

happiness2019 has (156, 10) Rows X Columns
************
Rank 0
Country 0
Happiness Score 0
GDP per Capita 0
Social Support 0
Health (Life Expectancy) 0
Freedom 0
Generosity 0
Trust 0
Region 7
dtype: int64
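One caveat about the check above: the 2015 to 2017 entries still show the raw column names. The files list was created before those frames were reassigned with drop, so it keeps pointing at the original objects. The sketch below reproduces the effect on a hypothetical one-row frame and shows one way around it, which is to rebuild the collection right before checking.

```python
import pandas as pd

# Hypothetical one-row frame standing in for the raw 2015 file.
happiness2015 = pd.DataFrame({"Country": ["Brazil"], "Standard Error": [0.04076]})
files = [happiness2015]  # captured before wrangling, as in this notebook

# drop() returns a new DataFrame; rebinding the name leaves the list untouched.
happiness2015 = happiness2015.drop(columns=["Standard Error"])

stale_columns = list(files[0].columns)       # still the raw columns
fresh_columns = list(happiness2015.columns)  # the wrangled columns

# Rebuilding the mapping just before the check avoids the stale reference.
frames = {"happiness2015": happiness2015}
null_counts = {name: int(df.isnull().sum().sum()) for name, df in frames.items()}
print(stale_columns, fresh_columns, null_counts)
```

Frames that are modified in place, such as happiness2018 and happiness2019 here, do not show this effect until they are reassigned.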
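happiness2018 and happiness2019 have null values in the Region column, and happiness2018 also has one null in Trust. The 2015 to 2017 results above describe the raw files, because the files list was built before those frames were reassigned.
Let's view which countries have the null Region values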
print(happiness2018[["Country", "Region"]][happiness2018["Region"].isnull()])
               Country Region
37   Trinidad & Tobago    NaN
48              Belize    NaN
57     Northern Cyprus    NaN
97             Somalia    NaN
118            Namibia    NaN
153        South Sudan    NaN
print(happiness2019[["Country", "Region"]][happiness2019["Region"].isnull()])
               Country Region
38   Trinidad & Tobago    NaN
63     Northern Cyprus    NaN
83     North Macedonia    NaN
111            Somalia    NaN
112            Namibia    NaN
119             Gambia    NaN
155        South Sudan    NaN
"Trinidad & Tobago" in country_region_dictionary
False
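The lookup fails because the 2018 and 2019 files spell some country names differently from 2015. As an alternative to assigning those regions by hand, a name alias table can translate the newer spellings before the lookup. This is a minimal sketch: NAME_ALIASES and lookup_region are hypothetical names, and the dictionary excerpt is copied from the 2015 country and region pairs listed in this notebook.

```python
# Excerpt of the 2015 country -> region pairs listed in this notebook.
country_region_dictionary = {
    "Trinidad and Tobago": "Latin America and Caribbean",
    "North Cyprus": "Western Europe",
    "Macedonia": "Central and Eastern Europe",
}

# Hypothetical alias table: newer spelling -> spelling used in the 2015 file.
NAME_ALIASES = {
    "Trinidad & Tobago": "Trinidad and Tobago",
    "Northern Cyprus": "North Cyprus",
    "North Macedonia": "Macedonia",
}


def lookup_region(country, regions, aliases):
    """Return the region for a country, trying its alias first; None if unknown."""
    canonical = aliases.get(country, country)
    return regions.get(canonical)


for name in ["Trinidad & Tobago", "North Macedonia", "Belize"]:
    print(name, "->", lookup_region(name, country_region_dictionary, NAME_ALIASES))
```

Countries that never appear in the 2015 file, such as Belize, still come back empty and need a manual decision.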
Let's fix the null values of happiness2018
Map null regions based on the region of a nearby country
set(zip(happiness2015["Country"], happiness2015["Region"]))
{('Afghanistan', 'Southern Asia'),
('Albania', 'Central and Eastern Europe'),
('Algeria', 'Middle East and Northern Africa'),
('Angola', 'Sub-Saharan Africa'),
('Argentina', 'Latin America and Caribbean'),
('Armenia', 'Central and Eastern Europe'),
('Australia', 'Australia and New Zealand'),
('Austria', 'Western Europe'),
('Azerbaijan', 'Central and Eastern Europe'),
('Bahrain', 'Middle East and Northern Africa'),
('Bangladesh', 'Southern Asia'),
('Belarus', 'Central and Eastern Europe'),
('Belgium', 'Western Europe'),
('Benin', 'Sub-Saharan Africa'),
('Bhutan', 'Southern Asia'),
('Bolivia', 'Latin America and Caribbean'),
('Bosnia and Herzegovina', 'Central and Eastern Europe'),
('Botswana', 'Sub-Saharan Africa'),
('Brazil', 'Latin America and Caribbean'),
('Bulgaria', 'Central and Eastern Europe'),
('Burkina Faso', 'Sub-Saharan Africa'),
('Burundi', 'Sub-Saharan Africa'),
('Cambodia', 'Southeastern Asia'),
('Cameroon', 'Sub-Saharan Africa'),
('Canada', 'North America'),
('Central African Republic', 'Sub-Saharan Africa'),
('Chad', 'Sub-Saharan Africa'),
('Chile', 'Latin America and Caribbean'),
('China', 'Eastern Asia'),
('Colombia', 'Latin America and Caribbean'),
('Comoros', 'Sub-Saharan Africa'),
('Congo (Brazzaville)', 'Sub-Saharan Africa'),
('Congo (Kinshasa)', 'Sub-Saharan Africa'),
('Costa Rica', 'Latin America and Caribbean'),
('Croatia', 'Central and Eastern Europe'),
('Cyprus', 'Western Europe'),
('Czech Republic', 'Central and Eastern Europe'),
('Denmark', 'Western Europe'),
('Djibouti', 'Sub-Saharan Africa'),
('Dominican Republic', 'Latin America and Caribbean'),
('Ecuador', 'Latin America and Caribbean'),
('Egypt', 'Middle East and Northern Africa'),
('El Salvador', 'Latin America and Caribbean'),
('Estonia', 'Central and Eastern Europe'),
('Ethiopia', 'Sub-Saharan Africa'),
('Finland', 'Western Europe'),
('France', 'Western Europe'),
('Gabon', 'Sub-Saharan Africa'),
('Georgia', 'Central and Eastern Europe'),
('Germany', 'Western Europe'),
('Ghana', 'Sub-Saharan Africa'),
('Greece', 'Western Europe'),
('Guatemala', 'Latin America and Caribbean'),
('Guinea', 'Sub-Saharan Africa'),
('Haiti', 'Latin America and Caribbean'),
('Honduras', 'Latin America and Caribbean'),
('Hong Kong', 'Eastern Asia'),
('Hungary', 'Central and Eastern Europe'),
('Iceland', 'Western Europe'),
('India', 'Southern Asia'),
('Indonesia', 'Southeastern Asia'),
('Iran', 'Middle East and Northern Africa'),
('Iraq', 'Middle East and Northern Africa'),
('Ireland', 'Western Europe'),
('Israel', 'Middle East and Northern Africa'),
('Italy', 'Western Europe'),
('Ivory Coast', 'Sub-Saharan Africa'),
('Jamaica', 'Latin America and Caribbean'),
('Japan', 'Eastern Asia'),
('Jordan', 'Middle East and Northern Africa'),
('Kazakhstan', 'Central and Eastern Europe'),
('Kenya', 'Sub-Saharan Africa'),
('Kosovo', 'Central and Eastern Europe'),
('Kuwait', 'Middle East and Northern Africa'),
('Kyrgyzstan', 'Central and Eastern Europe'),
('Laos', 'Southeastern Asia'),
('Latvia', 'Central and Eastern Europe'),
('Lebanon', 'Middle East and Northern Africa'),
('Lesotho', 'Sub-Saharan Africa'),
('Liberia', 'Sub-Saharan Africa'),
('Libya', 'Middle East and Northern Africa'),
('Lithuania', 'Central and Eastern Europe'),
('Luxembourg', 'Western Europe'),
('Macedonia', 'Central and Eastern Europe'),
('Madagascar', 'Sub-Saharan Africa'),
('Malawi', 'Sub-Saharan Africa'),
('Malaysia', 'Southeastern Asia'),
('Mali', 'Sub-Saharan Africa'),
('Malta', 'Western Europe'),
('Mauritania', 'Sub-Saharan Africa'),
('Mauritius', 'Sub-Saharan Africa'),
('Mexico', 'Latin America and Caribbean'),
('Moldova', 'Central and Eastern Europe'),
('Mongolia', 'Eastern Asia'),
('Montenegro', 'Central and Eastern Europe'),
('Morocco', 'Middle East and Northern Africa'),
('Mozambique', 'Sub-Saharan Africa'),
('Myanmar', 'Southeastern Asia'),
('Nepal', 'Southern Asia'),
('Netherlands', 'Western Europe'),
('New Zealand', 'Australia and New Zealand'),
('Nicaragua', 'Latin America and Caribbean'),
('Niger', 'Sub-Saharan Africa'),
('Nigeria', 'Sub-Saharan Africa'),
('North Cyprus', 'Western Europe'),
('Norway', 'Western Europe'),
('Oman', 'Middle East and Northern Africa'),
('Pakistan', 'Southern Asia'),
('Palestinian Territories', 'Middle East and Northern Africa'),
('Panama', 'Latin America and Caribbean'),
('Paraguay', 'Latin America and Caribbean'),
('Peru', 'Latin America and Caribbean'),
('Philippines', 'Southeastern Asia'),
('Poland', 'Central and Eastern Europe'),
('Portugal', 'Western Europe'),
('Qatar', 'Middle East and Northern Africa'),
('Romania', 'Central and Eastern Europe'),
('Russia', 'Central and Eastern Europe'),
('Rwanda', 'Sub-Saharan Africa'),
('Saudi Arabia', 'Middle East and Northern Africa'),
('Senegal', 'Sub-Saharan Africa'),
('Serbia', 'Central and Eastern Europe'),
('Sierra Leone', 'Sub-Saharan Africa'),
('Singapore', 'Southeastern Asia'),
('Slovakia', 'Central and Eastern Europe'),
('Slovenia', 'Central and Eastern Europe'),
('Somaliland region', 'Sub-Saharan Africa'),
('South Africa', 'Sub-Saharan Africa'),
('South Korea', 'Eastern Asia'),
('Spain', 'Western Europe'),
('Sri Lanka', 'Southern Asia'),
('Sudan', 'Sub-Saharan Africa'),
('Suriname', 'Latin America and Caribbean'),
('Swaziland', 'Sub-Saharan Africa'),
('Sweden', 'Western Europe'),
('Switzerland', 'Western Europe'),
('Syria', 'Middle East and Northern Africa'),
('Taiwan', 'Eastern Asia'),
('Tajikistan', 'Central and Eastern Europe'),
('Tanzania', 'Sub-Saharan Africa'),
('Thailand', 'Southeastern Asia'),
('Togo', 'Sub-Saharan Africa'),
('Trinidad and Tobago', 'Latin America and Caribbean'),
('Tunisia', 'Middle East and Northern Africa'),
('Turkey', 'Middle East and Northern Africa'),
('Turkmenistan', 'Central and Eastern Europe'),
('Uganda', 'Sub-Saharan Africa'),
('Ukraine', 'Central and Eastern Europe'),
('United Arab Emirates', 'Middle East and Northern Africa'),
('United Kingdom', 'Western Europe'),
('United States', 'North America'),
('Uruguay', 'Latin America and Caribbean'),
('Uzbekistan', 'Central and Eastern Europe'),
('Venezuela', 'Latin America and Caribbean'),
('Vietnam', 'Southeastern Asia'),
('Yemen', 'Middle East and Northern Africa'),
('Zambia', 'Sub-Saharan Africa'),
('Zimbabwe', 'Sub-Saharan Africa')}
Show a list of the countries with null values
list(happiness2018[happiness2018["Region"].isnull()]["Country"])
['Trinidad & Tobago', 'Belize', 'Northern Cyprus', 'Somalia', 'Namibia', 'South Sudan']
Map Regions based on the nearest neighbor
For example, Namibia: South Africa
Northern Cyprus: Cyprus
etc
NaN_region_dictionary_2018 ={'Trinidad & Tobago':'Latin America and Caribbean',
'Belize':'Latin America and Caribbean',
'Northern Cyprus':'Western Europe',
'Somalia':'Middle East and Northern Africa',
'Namibia':'Sub-Saharan Africa',
'South Sudan':'Sub-Saharan Africa'
}
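# Fill each missing Region by country name instead of relying on the dictionary order matching the row order
happiness2018["Region"] = happiness2018["Region"].fillna(happiness2018["Country"].map(NaN_region_dictionary_2018))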
happiness2018.isnull().sum()
Rank 0
Country 0
Region 0
Happiness Score 0
GDP per Capita 0
Social Support 0
Health (Life Expectancy) 0
Freedom 0
Generosity 0
Trust 1
dtype: int64
Check that the null regions were mapped correctly
happiness2018.iloc[[37, 48, 97, 118, 153]]
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 37 | 38 | Trinidad & Tobago | Latin America and Caribbean | 6.192 | 1.223 | 1.492 | 0.564 | 0.575 | 0.171 | 0.019 |
| 48 | 49 | Belize | Latin America and Caribbean | 5.956 | 0.807 | 1.101 | 0.474 | 0.593 | 0.183 | 0.089 |
| 97 | 98 | Somalia | Middle East and Northern Africa | 4.982 | 0.000 | 0.712 | 0.115 | 0.674 | 0.238 | 0.282 |
| 118 | 119 | Namibia | Sub-Saharan Africa | 4.441 | 0.874 | 1.281 | 0.365 | 0.519 | 0.051 | 0.064 |
| 153 | 154 | South Sudan | Sub-Saharan Africa | 3.254 | 0.337 | 0.608 | 0.177 | 0.112 | 0.224 | 0.106 |
Let's fix the Null values in happiness2019.
Show the countries with null Region values
list(happiness2019["Country"][happiness2019["Region"].isnull()])
['Trinidad & Tobago', 'Northern Cyprus', 'North Macedonia', 'Somalia', 'Namibia', 'Gambia', 'South Sudan']
Map Regions based on the nearest neighbor
# North Macedonia takes the region that the 2015 data assigns to Macedonia
Nan_region_dictionary_2019 ={'Trinidad & Tobago':'Latin America and Caribbean',
'Northern Cyprus':'Western Europe',
'North Macedonia': 'Central and Eastern Europe',
'Somalia':'Middle East and Northern Africa',
'Namibia':'Sub-Saharan Africa',
'Gambia':'Sub-Saharan Africa',
'South Sudan':'Sub-Saharan Africa',
}
# Fill each missing Region by country name instead of relying on the dictionary order matching the row order
happiness2019["Region"] = happiness2019["Region"].fillna(happiness2019["Country"].map(Nan_region_dictionary_2019))
Check whether any null values remain
happiness2019.isnull().sum()
Rank 0
Country 0
Region 0
Happiness Score 0
GDP per Capita 0
Social Support 0
Health (Life Expectancy) 0
Freedom 0
Generosity 0
Trust 0
dtype: int64
happiness2019.iloc[[38, 63, 83, 111, 112, 119, 155]]
| | Rank | Country | Region | Happiness Score | GDP per Capita | Social Support | Health (Life Expectancy) | Freedom | Generosity | Trust |
|---|---|---|---|---|---|---|---|---|---|---|
| 38 | 39 | Trinidad & Tobago | Latin America and Caribbean | 6.192 | 1.231 | 1.477 | 0.713 | 0.489 | 0.185 | 0.016 |
| 63 | 64 | Northern Cyprus | Western Europe | 5.718 | 1.263 | 1.252 | 1.042 | 0.417 | 0.191 | 0.162 |
| 83 | 84 | North Macedonia | Central and Eastern Europe | 5.274 | 0.983 | 1.294 | 0.838 | 0.345 | 0.185 | 0.034 |
| 111 | 112 | Somalia | Middle East and Northern Africa | 4.668 | 0.000 | 0.698 | 0.268 | 0.559 | 0.243 | 0.270 |
| 112 | 113 | Namibia | Sub-Saharan Africa | 4.639 | 0.879 | 1.313 | 0.477 | 0.401 | 0.070 | 0.056 |
| 119 | 120 | Gambia | Sub-Saharan Africa | 4.516 | 0.308 | 0.939 | 0.428 | 0.382 | 0.269 | 0.167 |
| 155 | 156 | South Sudan | Sub-Saharan Africa | 2.853 | 0.306 | 0.575 | 0.295 | 0.010 | 0.202 | 0.091 |
Stage 3: Data Visualization
The Happiness Score varies depending on 6 factors:
- Economy
- Health
- Family
- Freedom
- Generosity
- Trust
Plotly
happiness2015["Year"]=2015
happiness2015=happiness2015.set_index("Year")
happiness2016["Year"]=2016
happiness2016=happiness2016.set_index("Year")
happiness2017["Year"]=2017
happiness2017=happiness2017.set_index("Year")
happiness2018["Year"]=2018
happiness2018=happiness2018.set_index("Year")
happiness2019["Year"]=2019
happiness2019=happiness2019.set_index("Year")
happiness_merge = pd.concat([happiness2015, happiness2016, happiness2017, happiness2018, happiness2019])
happiness_merge.columns
Index(['Rank', 'Country', 'Region', 'Happiness Score', 'GDP per Capita',
'Social Support', 'Health (Life Expectancy)', 'Freedom', 'Generosity',
'Trust'],
dtype='object')
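The five pairs of Year assignments followed by pd.concat could also be written as a single expression. The sketch below is one option, using two hypothetical one-row frames built from rows shown earlier in this notebook. It keeps Year as an ordinary column rather than the index, so the later plots would not need reset_index.

```python
import pandas as pd

# Hypothetical one-row stand-ins for the wrangled yearly frames.
frames = {
    2018: pd.DataFrame({"Country": ["Jordan"], "Happiness Score": [5.161]}),
    2019: pd.DataFrame({"Country": ["Latvia"], "Happiness Score": [5.94]}),
}

# assign() adds the Year column to a copy of each frame before stacking them.
merged = pd.concat(
    [df.assign(Year=year) for year, df in frames.items()],
    ignore_index=True,
)
print(merged)
```

Because assign returns a copy, the yearly frames themselves are left unchanged.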
happiness_merge.Region.unique()
array(['Western Europe', 'North America', 'Australia and New Zealand',
'Middle East and Northern Africa', 'Latin America and Caribbean',
'Southeastern Asia', 'Central and Eastern Europe', 'Eastern Asia',
'Sub-Saharan Africa', 'Southern Asia', nan], dtype=object)
fig = px.line(happiness_merge[happiness_merge["Region"] == "Southeastern Asia"].reset_index(),
x="Year",
y="Happiness Score",
color='Country',
title="Southeastern Asia")
fig.update_layout(
xaxis = dict(
tickmode = 'linear',
tick0 = 2015,
dtick = 1
)
)
fig.show()
fig = px.line(happiness_merge[happiness_merge["Region"] == "Australia and New Zealand"].reset_index(),
x="Year",
y="Happiness Score",
color='Country',
title="Australia and New Zealand")
fig.update_layout(
xaxis = dict(
tickmode = 'linear',
tick0 = 2015,
dtick = 1
)
)
fig.show()
px.scatter(happiness_merge[happiness_merge["Region"].isin(['Western Europe',
'North America',
'Australia and New Zealand',
'Middle East and Northern Africa',
'Latin America and Caribbean',
'Southeastern Asia', 'Central and Eastern Europe',
'Eastern Asia',
'Sub-Saharan Africa',
'Southern Asia'])].reset_index(),
x ="GDP per Capita",
y ="Happiness Score",
color ="Region",
size = "Happiness Score",
hover_name = "Country",
animation_frame = "Year",
animation_group = "Country",
range_x = [0,1.8],
range_y =[3,8])
- The Happiness Score is consistently positively correlated with the Economy and Health factors: countries that are more developed economically and have better healthcare tend to score higher.
- Continents are split unevenly into Regions, and some countries are missing scores in some years, so the average Happiness Score has been used.
- Random observation: the happiest countries are clustered away from the equator, and also where population density is lower. Since population is not recorded in the dataset, no correlation can be drawn from this data.
- Factors like “years since last war/insurgency”, “ecological ownership of nature”, and “freedom in terms of democracy & human rights” could be introduced as qualitative aspects in scoring happiness.
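The first observation can be given a number. The sketch below computes a Pearson correlation between Happiness Score and GDP per Capita for the nine 2019 rows that happen to be printed in this notebook. That is a tiny, non-random sample, so the result only illustrates the calculation and says little about the full dataset. The pearson helper is written out by hand for illustration.

```python
from math import sqrt

# (Happiness Score, GDP per Capita) copied from the 2019 rows printed in this notebook.
rows_2019 = {
    "Latvia": (5.94, 1.187),
    "Congo (Kinshasa)": (4.418, 0.094),
    "Trinidad & Tobago": (6.192, 1.231),
    "Northern Cyprus": (5.718, 1.263),
    "North Macedonia": (5.274, 0.983),
    "Somalia": (4.668, 0.000),
    "Namibia": (4.639, 0.879),
    "Gambia": (4.516, 0.308),
    "South Sudan": (2.853, 0.306),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


scores = [score for score, _ in rows_2019.values()]
gdp = [value for _, value in rows_2019.values()]
r = pearson(scores, gdp)
print(f"Pearson r on the sampled rows: {r:.2f}")
```

On the full merged frame, the equivalent would be happiness_merge[["Happiness Score", "GDP per Capita", "Health (Life Expectancy)"]].corr().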